Overview
Dataset statistics
| Number of variables | 29 |
|---|---|
| Number of observations | 212354 |
| Missing cells | 42486 |
| Missing cells (%) | 0.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 47.0 MiB |
| Average record size in memory | 232.0 B |
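The overview figures can be cross-checked with simple arithmetic. The sketch below assumes every cell occupies 8 bytes; that is an assumption, not something the report states, although it agrees with the reported 232.0 B per record for 29 variables. All variable names are illustrative.

```python
# Cross-check of the dataset statistics reported above.
n_variables = 29
n_observations = 212_354
missing_cells = 42_486

# Share of missing cells across the whole table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 8 bytes per cell (for example a 64-bit number or an object pointer).
BYTES_PER_CELL = 8
record_bytes = n_variables * BYTES_PER_CELL
total_mib = record_bytes * n_observations / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {record_bytes} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption, a single variable occupies about 1.6 MiB, which is the memory size listed for each variable.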
Variable types
| Numeric | 13 |
|---|---|
| Categorical | 16 |
Alerts
| biopsy_results is highly imbalanced (53.3%) | Imbalance |
|---|---|
| label is highly imbalanced (53.5%) | Imbalance |
| existing_conditions has 42486 (20.0%) missing values | Missing |
| diana_microt has unique values | Unique |
| elmmo has unique values | Unique |
| microcosm has unique values | Unique |
| miranda has unique values | Unique |
| mirdb has unique values | Unique |
| pictar has unique values | Unique |
| pita has unique values | Unique |
| targetscan has unique values | Unique |
| predicted.sum has unique values | Unique |
| all.sum has unique values | Unique |
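The 53.3% reported for biopsy_results is consistent with an imbalance score defined as one minus the normalized Shannon entropy of the class shares. That definition is an assumption made for this sketch, not something the report states, and the helper name `imbalance_score` is hypothetical. The counts are the ones listed under Common Values for biopsy_results.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Negative / Positive counts for biopsy_results.
score = imbalance_score([191_223, 21_131])
print(f"biopsy_results imbalance: {score:.1%}")
```

A perfectly balanced column scores 0 under this definition, so a binary variable split 90/10 lands a little above 0.5.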
Reproduction
| Analysis started | 2025-02-23 06:41:54.969601 |
|---|---|
| Analysis finished | 2025-02-23 06:42:28.844547 |
| Duration | 33.87 seconds |
| Software version | ydata-profiling v4.12.2 |
| Download configuration | config.json |
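The reported duration follows from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2025-02-23 06:41:54.969601")
finished = datetime.fromisoformat("2025-02-23 06:42:28.844547")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```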
Variables
age
Real number (ℝ)
| Distinct | 70 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.258036 |
| Minimum | 20 |
|---|---|
| Maximum | 89 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 26 |
| Q1 | 37 |
| median | 50 |
| Q3 | 69 |
| 95-th percentile | 85 |
| Maximum | 89 |
| Range | 69 |
| Interquartile range (IQR) | 32 |
Descriptive statistics
| Standard deviation | 18.984419 |
|---|---|
| Coefficient of variation (CV) | 0.35646112 |
| Kurtosis | -1.0959819 |
| Mean | 53.258036 |
| Median Absolute Deviation (MAD) | 15 |
| Skewness | 0.24603705 |
| Sum | 11309557 |
| Variance | 360.40817 |
| Monotonicity | Not monotonic |
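Several of the age statistics are derived from others, so they can be verified against each other. The sketch below recomputes them from the reported values; the variable names are illustrative, and small differences are expected because the inputs are rounded.

```python
# Reported values for age.
n = 212_354
mean, std = 53.258036, 18.984419
q1, q3 = 37, 69
minimum, maximum = 20, 89

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum
total = mean * n                 # sum of all values

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.5f}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
print(f"Sum      = {total:.0f}")
```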
Common values
| Value | Count | Frequency (%) |
| 41 | 4549 | 2.1% |
| 34 | 4518 | 2.1% |
| 42 | 4505 | 2.1% |
| 33 | 4492 | 2.1% |
| 36 | 4480 | 2.1% |
| 37 | 4475 | 2.1% |
| 47 | 4464 | 2.1% |
| 30 | 4457 | 2.1% |
| 46 | 4417 | 2.1% |
| 48 | 4408 | 2.1% |
| Other values (60) | 167589 | 78.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 20 | 1755 | 0.8% |
| 21 | 1762 | 0.8% |
| 22 | 1724 | 0.8% |
| 23 | 1707 | 0.8% |
| 24 | 1795 | 0.8% |
| 25 | 1750 | 0.8% |
| 26 | 1817 | 0.9% |
| 27 | 1820 | 0.9% |
| 28 | 1860 | 0.9% |
| 29 | 1736 | 0.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 89 | 2645 | 1.2% |
| 88 | 2563 | 1.2% |
| 87 | 2717 | 1.3% |
| 86 | 2575 | 1.2% |
| 85 | 2609 | 1.2% |
| 84 | 2712 | 1.3% |
| 83 | 2630 | 1.2% |
| 82 | 2615 | 1.2% |
| 81 | 2673 | 1.3% |
| 80 | 2710 | 1.3% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.5994519 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Male |
|---|---|
| 2nd row | Female |
| 3rd row | Male |
| 4th row | Female |
| 5th row | Male |
Common Values
| Value | Count | Frequency (%) |
| Male | 148706 | 70.0% |
| Female | 63648 | 30.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| male | 148706 | |
| female | 63648 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 276002 | |
| a | 212354 | |
| l | 212354 | |
| M | 148706 | |
| F | 63648 | 6.5% |
| m | 63648 | 6.5% |
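The character table can be reproduced from the value counts alone, since every occurrence of a value contributes each of its characters once. A minimal sketch, using the counts listed under Common Values for this variable:

```python
from collections import Counter

# Value counts from the Common Values table.
value_counts = {"Male": 148_706, "Female": 63_648}

# Weight each character by how often its value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, c in char_counts.most_common():
    print(f"{ch}\t{c}\t{100 * c / total_chars:.1f}%")
```

The total of 976712 characters matches the single figure reported under the category, script, and block groupings.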
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 976712 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 276002 | |
| a | 212354 | |
| l | 212354 | |
| M | 148706 | |
| F | 63648 | 6.5% |
| m | 63648 | 6.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 976712 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 276002 | |
| a | 212354 | |
| l | 212354 | |
| M | 148706 | |
| F | 63648 | 6.5% |
| m | 63648 | 6.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 976712 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 276002 | |
| a | 212354 | |
| l | 212354 | |
| M | 148706 | |
| F | 63648 | 6.5% |
| m | 63648 | 6.5% |
ethnicity
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Ethnicity_A |
|---|---|
| 2nd row | Ethnicity_B |
| 3rd row | Ethnicity_A |
| 4th row | Ethnicity_A |
| 5th row | Ethnicity_A |
Common Values
| Value | Count | Frequency (%) |
| Ethnicity_A | 127571 | 60.1% |
| Ethnicity_B | 63569 | 29.9% |
| Ethnicity_C | 21214 | 10.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| ethnicity_a | 127571 | |
| ethnicity_b | 63569 | |
| ethnicity_c | 21214 | 10.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 424708 | |
| i | 424708 | |
| E | 212354 | |
| h | 212354 | |
| n | 212354 | |
| c | 212354 | |
| y | 212354 | |
| _ | 212354 | |
| A | 127571 | 5.5% |
| B | 63569 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2335894 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| t | 424708 | |
| i | 424708 | |
| E | 212354 | |
| h | 212354 | |
| n | 212354 | |
| c | 212354 | |
| y | 212354 | |
| _ | 212354 | |
| A | 127571 | 5.5% |
| B | 63569 | 2.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2335894 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| t | 424708 | |
| i | 424708 | |
| E | 212354 | |
| h | 212354 | |
| n | 212354 | |
| c | 212354 | |
| y | 212354 | |
| _ | 212354 | |
| A | 127571 | 5.5% |
| B | 63569 | 2.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2335894 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| t | 424708 | |
| i | 424708 | |
| E | 212354 | |
| h | 212354 | |
| n | 212354 | |
| c | 212354 | |
| y | 212354 | |
| _ | 212354 | |
| A | 127571 | 5.5% |
| B | 63569 | 2.7% |
geographical_location
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 8.9934025 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Other |
|---|---|
| 2nd row | California |
| 3rd row | California |
| 4th row | Other |
| 5th row | California |
Common Values
| Value | Count | Frequency (%) |
| California | 169603 | 79.9% |
| Other | 42751 | 20.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| california | 169603 | |
| other | 42751 | 20.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 339206 | |
| i | 339206 | |
| r | 212354 | |
| C | 169603 | |
| l | 169603 | |
| f | 169603 | |
| o | 169603 | |
| n | 169603 | |
| O | 42751 | 2.2% |
| t | 42751 | 2.2% |
| Other values (2) | 85502 | 4.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1909785 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| a | 339206 | |
| i | 339206 | |
| r | 212354 | |
| C | 169603 | |
| l | 169603 | |
| f | 169603 | |
| o | 169603 | |
| n | 169603 | |
| O | 42751 | 2.2% |
| t | 42751 | 2.2% |
| Other values (2) | 85502 | 4.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1909785 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| a | 339206 | |
| i | 339206 | |
| r | 212354 | |
| C | 169603 | |
| l | 169603 | |
| f | 169603 | |
| o | 169603 | |
| n | 169603 | |
| O | 42751 | 2.2% |
| t | 42751 | 2.2% |
| Other values (2) | 85502 | 4.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1909785 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| a | 339206 | |
| i | 339206 | |
| r | 212354 | |
| C | 169603 | |
| l | 169603 | |
| f | 169603 | |
| o | 169603 | |
| n | 169603 | |
| O | 42751 | 2.2% |
| t | 42751 | 2.2% |
| Other values (2) | 85502 | 4.5% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 148528 | 69.9% |
| 1 | 63826 | 30.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 148528 | |
| 1 | 63826 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 148528 | |
| 1 | 63826 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 148528 | |
| 1 | 63826 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 148528 | |
| 1 | 63826 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 148528 | |
| 1 | 63826 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 127592 | 60.1% |
| 1 | 84762 | 39.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 127592 | |
| 1 | 84762 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 127592 | |
| 1 | 84762 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 127592 | |
| 1 | 84762 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 127592 | |
| 1 | 84762 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 127592 | |
| 1 | 84762 |
alcohol_consumption
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 106309 | 50.1% |
| 1 | 106045 | 49.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 106309 | |
| 1 | 106045 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 106309 | |
| 1 | 106045 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 106309 | |
| 1 | 106045 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 106309 | |
| 1 | 106045 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 106309 | |
| 1 | 106045 |
helicobacter_pylori_infection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 159378 | 75.1% |
| 1 | 52976 | 24.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 159378 | |
| 1 | 52976 | 24.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 159378 | |
| 1 | 52976 | 24.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 159378 | |
| 1 | 52976 | 24.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 159378 | |
| 1 | 52976 | 24.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 212354 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 159378 | |
| 1 | 52976 | 24.9% |
dietary_habits
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.7996317 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Low_Salt |
|---|---|
| 2nd row | High_Salt |
| 3rd row | High_Salt |
| 4th row | High_Salt |
| 5th row | High_Salt |
Common Values
| Value | Count | Frequency (%) |
| High_Salt | 169805 | 80.0% |
| Low_Salt | 42549 | 20.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| high_salt | 169805 | |
| low_salt | 42549 | 20.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 212354 | |
| a | 212354 | |
| S | 212354 | |
| _ | 212354 | |
| t | 212354 | |
| g | 169805 | |
| i | 169805 | |
| H | 169805 | |
| h | 169805 | |
| L | 42549 | 2.3% |
| Other values (2) | 85098 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1868637 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| l | 212354 | |
| a | 212354 | |
| S | 212354 | |
| _ | 212354 | |
| t | 212354 | |
| g | 169805 | |
| i | 169805 | |
| H | 169805 | |
| h | 169805 | |
| L | 42549 | 2.3% |
| Other values (2) | 85098 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1868637 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| l | 212354 | |
| a | 212354 | |
| S | 212354 | |
| _ | 212354 | |
| t | 212354 | |
| g | 169805 | |
| i | 169805 | |
| H | 169805 | |
| h | 169805 | |
| L | 42549 | 2.3% |
| Other values (2) | 85098 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1868637 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| l | 212354 | |
| a | 212354 | |
| S | 212354 | |
| _ | 212354 | |
| t | 212354 | |
| g | 169805 | |
| i | 169805 | |
| H | 169805 | |
| h | 169805 | |
| L | 42549 | 2.3% |
| Other values (2) | 85098 |
existing_conditions
Categorical
Missing 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 42486 |
| Missing (%) | 20.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 13.632497 |
| Min length | 8 |
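The mean length of 13.632497 only works out if rows with a missing value are left out of the average. The sketch below reproduces it from the counts listed under Common Values for existing_conditions; the variable names are illustrative.

```python
# Counts from the Common Values table for existing_conditions.
value_counts = {"Chronic Gastritis": 106_309, "Diabetes": 63_559}
n_missing = 42_486

n_present = sum(value_counts.values())
n_total = n_present + n_missing

# Total characters over the non-missing rows (spaces included).
total_chars = sum(len(v) * c for v, c in value_counts.items())

mean_length = total_chars / n_present      # missing rows excluded
missing_pct = 100 * n_missing / n_total    # share of all rows

print(f"Mean length: {mean_length:.6f}")
print(f"Missing (%): {missing_pct:.1f}%")
```

Dividing by all 212354 rows instead would give a mean length of about 10.9, which does not match the report.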
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Chronic Gastritis |
|---|---|
| 2nd row | Diabetes |
| 3rd row | Chronic Gastritis |
| 4th row | Diabetes |
| 5th row | Chronic Gastritis |
Common Values
| Value | Count | Frequency (%) |
| Chronic Gastritis | 106309 | 50.1% |
| Diabetes | 63559 | 29.9% |
| (Missing) | 42486 | 20.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| chronic | 106309 | |
| gastritis | 106309 | |
| diabetes | 63559 |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 382486 | |
| t | 276177 | |
| s | 276177 | |
| r | 212618 | |
| a | 169868 | 7.3% |
| e | 127118 | 5.5% |
| C | 106309 | 4.6% |
| c | 106309 | 4.6% |
| n | 106309 | 4.6% |
| o | 106309 | 4.6% |
| Other values (5) | 446045 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2315725 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| i | 382486 | |
| t | 276177 | |
| s | 276177 | |
| r | 212618 | |
| a | 169868 | 7.3% |
| e | 127118 | 5.5% |
| C | 106309 | 4.6% |
| c | 106309 | 4.6% |
| n | 106309 | 4.6% |
| o | 106309 | 4.6% |
| Other values (5) | 446045 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2315725 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| i | 382486 | |
| t | 276177 | |
| s | 276177 | |
| r | 212618 | |
| a | 169868 | 7.3% |
| e | 127118 | 5.5% |
| C | 106309 | 4.6% |
| c | 106309 | 4.6% |
| n | 106309 | 4.6% |
| o | 106309 | 4.6% |
| Other values (5) | 446045 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2315725 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| i | 382486 | |
| t | 276177 | |
| s | 276177 | |
| r | 212618 | |
| a | 169868 | 7.3% |
| e | 127118 | 5.5% |
| C | 106309 | 4.6% |
| c | 106309 | 4.6% |
| n | 106309 | 4.6% |
| o | 106309 | 4.6% |
| Other values (5) | 446045 |
endoscopic_images
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 6 |
| Mean length | 6.6011754 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Normal |
|---|---|
| 2nd row | Normal |
| 3rd row | Normal |
| 4th row | Normal |
| 5th row | Abnormal |
Common Values
| Value | Count | Frequency (%) |
| Normal | 148523 | 69.9% |
| Abnormal | 63831 | 30.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| normal | 148523 | |
| abnormal | 63831 |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 212354 | |
| l | 212354 | |
| r | 212354 | |
| m | 212354 | |
| a | 212354 | |
| N | 148523 | |
| A | 63831 | 4.6% |
| b | 63831 | 4.6% |
| n | 63831 | 4.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1401786 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| o | 212354 | |
| l | 212354 | |
| r | 212354 | |
| m | 212354 | |
| a | 212354 | |
| N | 148523 | |
| A | 63831 | 4.6% |
| b | 63831 | 4.6% |
| n | 63831 | 4.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1401786 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| o | 212354 | |
| l | 212354 | |
| r | 212354 | |
| m | 212354 | |
| a | 212354 | |
| N | 148523 | |
| A | 63831 | 4.6% |
| b | 63831 | 4.6% |
| n | 63831 | 4.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1401786 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| o | 212354 | |
| l | 212354 | |
| r | 212354 | |
| m | 212354 | |
| a | 212354 | |
| N | 148523 | |
| A | 63831 | 4.6% |
| b | 63831 | 4.6% |
| n | 63831 | 4.6% |
biopsy_results
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Negative |
|---|---|
| 2nd row | Negative |
| 3rd row | Negative |
| 4th row | Negative |
| 5th row | Negative |
Common Values
| Value | Count | Frequency (%) |
| Negative | 191223 | 90.0% |
| Positive | 21131 | 10.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| negative | 191223 | |
| positive | 21131 | 10.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 403577 | |
| i | 233485 | |
| v | 212354 | |
| t | 212354 | |
| N | 191223 | |
| a | 191223 | |
| g | 191223 | |
| P | 21131 | 1.2% |
| o | 21131 | 1.2% |
| s | 21131 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 403577 | |
| i | 233485 | |
| v | 212354 | |
| t | 212354 | |
| N | 191223 | |
| a | 191223 | |
| g | 191223 | |
| P | 21131 | 1.2% |
| o | 21131 | 1.2% |
| s | 21131 | 1.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 403577 | |
| i | 233485 | |
| v | 212354 | |
| t | 212354 | |
| N | 191223 | |
| a | 191223 | |
| g | 191223 | |
| P | 21131 | 1.2% |
| o | 21131 | 1.2% |
| s | 21131 | 1.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 403577 | |
| i | 233485 | |
| v | 212354 | |
| t | 212354 | |
| N | 191223 | |
| a | 191223 | |
| g | 191223 | |
| P | 21131 | 1.2% |
| o | 21131 | 1.2% |
| s | 21131 | 1.2% |
ct_scan
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Negative |
|---|---|
| 2nd row | Negative |
| 3rd row | Negative |
| 4th row | Negative |
| 5th row | Negative |
Common Values
| Value | Count | Frequency (%) |
| Negative | 169740 | 79.9% |
| Positive | 42614 | 20.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| negative | 169740 | |
| positive | 42614 | 20.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 382094 | |
| i | 254968 | |
| v | 212354 | |
| t | 212354 | |
| N | 169740 | |
| a | 169740 | |
| g | 169740 | |
| P | 42614 | 2.5% |
| o | 42614 | 2.5% |
| s | 42614 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 382094 | |
| i | 254968 | |
| v | 212354 | |
| t | 212354 | |
| N | 169740 | |
| a | 169740 | |
| g | 169740 | |
| P | 42614 | 2.5% |
| o | 42614 | 2.5% |
| s | 42614 | 2.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 382094 | |
| i | 254968 | |
| v | 212354 | |
| t | 212354 | |
| N | 169740 | |
| a | 169740 | |
| g | 169740 | |
| P | 42614 | 2.5% |
| o | 42614 | 2.5% |
| s | 42614 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 382094 | |
| i | 254968 | |
| v | 212354 | |
| t | 212354 | |
| N | 169740 | |
| a | 169740 | |
| g | 169740 | |
| P | 42614 | 2.5% |
| o | 42614 | 2.5% |
| s | 42614 | 2.5% |
mature_mirna_acc
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | MIR123 |
|---|---|
| 2nd row | MIR123 |
| 3rd row | MIR345 |
| 4th row | MIR123 |
| 5th row | MIR345 |
Common Values
| Value | Count | Frequency (%) |
| MIR123 | 148596 | 70.0% |
| MIR234 | 42474 | 20.0% |
| MIR345 | 21284 | 10.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| mir123 | 148596 | |
| mir234 | 42474 | 20.0% |
| mir345 | 21284 | 10.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 212354 | |
| I | 212354 | |
| R | 212354 | |
| 3 | 212354 | |
| 2 | 191070 | |
| 1 | 148596 | |
| 4 | 63758 | 5.0% |
| 5 | 21284 | 1.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1274124 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| M | 212354 | |
| I | 212354 | |
| R | 212354 | |
| 3 | 212354 | |
| 2 | 191070 | |
| 1 | 148596 | |
| 4 | 63758 | 5.0% |
| 5 | 21284 | 1.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1274124 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| M | 212354 | |
| I | 212354 | |
| R | 212354 | |
| 3 | 212354 | |
| 2 | 191070 | |
| 1 | 148596 | |
| 4 | 63758 | 5.0% |
| 5 | 21284 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1274124 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| M | 212354 | |
| I | 212354 | |
| R | 212354 | |
| 3 | 212354 | |
| 2 | 191070 | |
| 1 | 148596 | |
| 4 | 63758 | 5.0% |
| 5 | 21284 | 1.7% |
mature_mirna_id
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | MIR123_1 |
|---|---|
| 2nd row | MIR234_2 |
| 3rd row | MIR345_3 |
| 4th row | MIR123_1 |
| 5th row | MIR123_1 |
Common Values
| Value | Count | Frequency (%) |
| MIR123_1 | 148248 | 69.8% |
| MIR234_2 | 42780 | 20.1% |
| MIR345_3 | 21326 | 10.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| mir123_1 | 148248 | |
| mir234_2 | 42780 | 20.1% |
| mir345_3 | 21326 | 10.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 296496 | |
| 2 | 233808 | |
| 3 | 233680 | |
| M | 212354 | |
| R | 212354 | |
| I | 212354 | |
| _ | 212354 | |
| 4 | 64106 | 3.8% |
| 5 | 21326 | 1.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 1 | 296496 | |
| 2 | 233808 | |
| 3 | 233680 | |
| M | 212354 | |
| R | 212354 | |
| I | 212354 | |
| _ | 212354 | |
| 4 | 64106 | 3.8% |
| 5 | 21326 | 1.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 1 | 296496 | |
| 2 | 233808 | |
| 3 | 233680 | |
| M | 212354 | |
| R | 212354 | |
| I | 212354 | |
| _ | 212354 | |
| 4 | 64106 | 3.8% |
| 5 | 21326 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 1698832 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 1 | 296496 | |
| 2 | 233808 | |
| 3 | 233680 | |
| M | 212354 | |
| R | 212354 | |
| I | 212354 | |
| _ | 212354 | |
| 4 | 64106 | 3.8% |
| 5 | 21326 | 1.3% |
target_symbol
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | TP53 |
|---|---|
| 2nd row | TP53 |
| 3rd row | KRAS |
| 4th row | KRAS |
| 5th row | CDH1 |
Common Values
| Value | Count | Frequency (%) |
| TP53 | 106351 | 50.1% |
| CDH1 | 63570 | 29.9% |
| KRAS | 42433 | 20.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| tp53 | 106351 | |
| cdh1 | 63570 | |
| kras | 42433 | 20.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| T | 106351 | |
| P | 106351 | |
| 5 | 106351 | |
| 3 | 106351 | |
| C | 63570 | |
| D | 63570 | |
| H | 63570 | |
| 1 | 63570 | |
| K | 42433 | 5.0% |
| R | 42433 | 5.0% |
| Other values (2) | 84866 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 849416 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| T | 106351 | |
| P | 106351 | |
| 5 | 106351 | |
| 3 | 106351 | |
| C | 63570 | |
| D | 63570 | |
| H | 63570 | |
| 1 | 63570 | |
| K | 42433 | 5.0% |
| R | 42433 | 5.0% |
| Other values (2) | 84866 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 849416 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| T | 106351 | |
| P | 106351 | |
| 5 | 106351 | |
| 3 | 106351 | |
| C | 63570 | |
| D | 63570 | |
| H | 63570 | |
| 1 | 63570 | |
| K | 42433 | 5.0% |
| R | 42433 | 5.0% |
| Other values (2) | 84866 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 849416 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| T | 106351 | |
| P | 106351 | |
| 5 | 106351 | |
| 3 | 106351 | |
| C | 63570 | |
| D | 63570 | |
| H | 63570 | |
| 1 | 63570 | |
| K | 42433 | 5.0% |
| R | 42433 | 5.0% |
| Other values (2) | 84866 |
target_entrez
Real number (ℝ)
| Distinct | 9000 |
|---|---|
| Distinct (%) | 4.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5496.1173 |
| Minimum | 1000 |
|---|---|
| Maximum | 9999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 1449 |
| Q1 | 3252 |
| median | 5491 |
| Q3 | 7738 |
| 95-th percentile | 9547 |
| Maximum | 9999 |
| Range | 8999 |
| Interquartile range (IQR) | 4486 |
Descriptive statistics
| Standard deviation | 2596.8176 |
|---|---|
| Coefficient of variation (CV) | 0.4724822 |
| Kurtosis | -1.1970572 |
| Mean | 5496.1173 |
| Median Absolute Deviation (MAD) | 2243 |
| Skewness | 0.0011763769 |
| Sum | 1.1671225 × 10⁹ |
| Variance | 6743461.8 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 5737 | 44 | < 0.1% |
| 8008 | 43 | < 0.1% |
| 1341 | 43 | < 0.1% |
| 3530 | 42 | < 0.1% |
| 6519 | 41 | < 0.1% |
| 4279 | 40 | < 0.1% |
| 1046 | 40 | < 0.1% |
| 4050 | 40 | < 0.1% |
| 9428 | 40 | < 0.1% |
| 7616 | 40 | < 0.1% |
| Other values (8990) | 211941 | 99.8% |
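The Frequency (%) column shows one decimal place and falls back to "< 0.1%" for shares that would otherwise round to zero. The sketch below mimics that display. The function name `format_share` is hypothetical, and the symmetric "> 99.9%" case is an assumption about how near-total shares are shown, not something visible in the table above.

```python
def format_share(count, total):
    """Format count/total as a one-decimal percentage with edge-case labels."""
    share = count / total
    # Non-zero shares that would display as 0.0% get an explicit bound instead.
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    # Assumed counterpart for shares just below 100%.
    if share < 1 and round(share, 3) == 1:
        return "> 99.9%"
    return f"{share * 100:.1f}%"


N = 212_354  # number of observations
print(format_share(44, N))        # most common target_entrez value
print(format_share(211_941, N))   # all other values combined
```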
Minimum 10 values
| Value | Count | Frequency (%) |
| 1000 | 30 | < 0.1% |
| 1001 | 22 | < 0.1% |
| 1002 | 26 | < 0.1% |
| 1003 | 19 | < 0.1% |
| 1004 | 21 | < 0.1% |
| 1005 | 27 | < 0.1% |
| 1006 | 28 | < 0.1% |
| 1007 | 32 | < 0.1% |
| 1008 | 24 | < 0.1% |
| 1009 | 25 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 9999 | 25 | < 0.1% |
| 9998 | 23 | < 0.1% |
| 9997 | 23 | < 0.1% |
| 9996 | 20 | < 0.1% |
| 9995 | 27 | < 0.1% |
| 9994 | 28 | < 0.1% |
| 9993 | 24 | < 0.1% |
| 9992 | 27 | < 0.1% |
| 9991 | 21 | < 0.1% |
| 9990 | 22 | < 0.1% |
target_ensembl
Real number (ℝ)
| Distinct | 191184 |
|---|---|
| Distinct (%) | 90.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1499874 |
| Minimum | 1000007 |
|---|---|
| Maximum | 1999998 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1000007 |
|---|---|
| 5-th percentile | 1049352.6 |
| Q1 | 1251177 |
| median | 1500323.5 |
| Q3 | 1748423.5 |
| 95-th percentile | 1949640.4 |
| Maximum | 1999998 |
| Range | 999991 |
| Interquartile range (IQR) | 497246.5 |
Descriptive statistics
| Standard deviation | 288258.53 |
|---|---|
| Coefficient of variation (CV) | 0.19218849 |
| Kurtosis | -1.193807 |
| Mean | 1499874 |
| Median Absolute Deviation (MAD) | 248674.5 |
| Skewness | -0.0013157608 |
| Sum | 3.1850425 × 10¹¹ |
| Variance | 8.3092979 × 10¹⁰ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 1108334 | 5 | < 0.1% |
| 1621017 | 5 | < 0.1% |
| 1228716 | 5 | < 0.1% |
| 1686987 | 5 | < 0.1% |
| 1471937 | 5 | < 0.1% |
| 1396220 | 5 | < 0.1% |
| 1307868 | 4 | < 0.1% |
| 1567669 | 4 | < 0.1% |
| 1465615 | 4 | < 0.1% |
| 1969936 | 4 | < 0.1% |
| Other values (191174) | 212308 | > 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1000007 | 1 | < 0.1% |
| 1000009 | 1 | < 0.1% |
| 1000012 | 1 | < 0.1% |
| 1000014 | 1 | < 0.1% |
| 1000029 | 1 | < 0.1% |
| 1000030 | 1 | < 0.1% |
| 1000038 | 1 | < 0.1% |
| 1000039 | 1 | < 0.1% |
| 1000048 | 1 | < 0.1% |
| 1000058 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 1999998 | 1 | < 0.1% |
| 1999994 | 1 | < 0.1% |
| 1999987 | 1 | < 0.1% |
| 1999979 | 2 | < 0.1% |
| 1999977 | 1 | < 0.1% |
| 1999976 | 1 | < 0.1% |
| 1999964 | 2 | < 0.1% |
| 1999957 | 1 | < 0.1% |
| 1999954 | 1 | < 0.1% |
| 1999951 | 1 | < 0.1% |
diana_microt
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.50052688 |
| Minimum | 5.4594294 × 10⁻⁶ |
|---|---|
| Maximum | 0.99999837 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 5.4594294 × 10⁻⁶ |
|---|---|
| 5-th percentile | 0.050329547 |
| Q1 | 0.25003383 |
| median | 0.50130179 |
| Q3 | 0.75078292 |
| 95-th percentile | 0.95008032 |
| Maximum | 0.99999837 |
| Range | 0.99999291 |
| Interquartile range (IQR) | 0.50074908 |
Descriptive statistics
| Standard deviation | 0.28875716 |
|---|---|
| Coefficient of variation (CV) | 0.5769064 |
| Kurtosis | -1.2013047 |
| Mean | 0.50052688 |
| Median Absolute Deviation (MAD) | 0.25039258 |
| Skewness | -0.0032353963 |
| Sum | 106288.89 |
| Variance | 0.083380699 |
| Monotonicity | Not monotonic |
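As a reading aid, the dispersion figures above can be compared with the theoretical moments of a continuous uniform distribution on [0, 1]. This is a hedged sketch: it assumes the reported kurtosis is excess (Fisher) kurtosis, and it is an observation about the printed numbers, not a statement about how the column was produced.

```python
import math

# Theoretical values for a continuous uniform distribution on [0, 1].
theory = {
    "mean": 0.5,
    "std": 1 / math.sqrt(12),
    "kurtosis": -6 / 5,  # excess kurtosis (assumed convention)
    "iqr": 0.5,
}

# Values printed for diana_microt in this report.
reported = {
    "mean": 0.50052688,
    "std": 0.28875716,
    "kurtosis": -1.2013047,
    "iqr": 0.50074908,
}

# Absolute gap between the reported and theoretical value for each statistic.
gaps = {key: abs(reported[key] - theory[key]) for key in reported}
for key, gap in gaps.items():
    print(f"{key:9s} reported={reported[key]: .6f} theory={theory[key]: .6f} gap={gap:.6f}")
```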
| Value | Count | Frequency (%) |
| 0.1469554288 | 1 | < 0.1% |
| 0.3813264408 | 1 | < 0.1% |
| 0.7766793261 | 1 | < 0.1% |
| 0.7422524583 | 1 | < 0.1% |
| 0.3561195763 | 1 | < 0.1% |
| 0.728199608 | 1 | < 0.1% |
| 0.3698819345 | 1 | < 0.1% |
| 0.3578230727 | 1 | < 0.1% |
| 0.09172957349 | 1 | < 0.1% |
| 0.4705122224 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 5.459429415 × 10⁻⁶ | 1 | |
| 1.378755261 × 10⁻⁵ | 1 | |
| 1.544225826 × 10⁻⁵ | 1 | |
| 1.603319656 × 10⁻⁵ | 1 | |
| 1.683744661 × 10⁻⁵ | 1 | |
| 1.903288069 × 10⁻⁵ | 1 | |
| 2.492624933 × 10⁻⁵ | 1 | |
| 3.036230069 × 10⁻⁵ | 1 | |
| 3.428841101 × 10⁻⁵ | 1 | |
| 4.738306033 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999983666 | 1 | |
| 0.9999924808 | 1 | |
| 0.9999890822 | 1 | |
| 0.9999877113 | 1 | |
| 0.9999828503 | 1 | |
| 0.9999783812 | 1 | |
| 0.9999769527 | 1 | |
| 0.9999731289 | 1 | |
| 0.9999668912 | 1 | |
| 0.9999661912 | 1 |
elmmo
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.49949712 |
| Minimum | 6.0890625 × 10⁻⁷ |
|---|---|
| Maximum | 0.9999993 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 6.0890625 × 10⁻⁷ |
|---|---|
| 5-th percentile | 0.049287588 |
| Q1 | 0.24864684 |
| median | 0.49980396 |
| Q3 | 0.74958433 |
| 95-th percentile | 0.95025324 |
| Maximum | 0.9999993 |
| Range | 0.99999869 |
| Interquartile range (IQR) | 0.50093749 |
Descriptive statistics
| Standard deviation | 0.28884517 |
|---|---|
| Coefficient of variation (CV) | 0.57827194 |
| Kurtosis | -1.2006813 |
| Mean | 0.49949712 |
| Median Absolute Deviation (MAD) | 0.25045947 |
| Skewness | 0.0023044896 |
| Sum | 106070.21 |
| Variance | 0.083431534 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.6896956352 | 1 | < 0.1% |
| 0.1870025301 | 1 | < 0.1% |
| 0.6596272982 | 1 | < 0.1% |
| 0.5951774949 | 1 | < 0.1% |
| 0.4263553566 | 1 | < 0.1% |
| 0.6901327169 | 1 | < 0.1% |
| 0.7974740682 | 1 | < 0.1% |
| 0.3561960374 | 1 | < 0.1% |
| 0.2488704074 | 1 | < 0.1% |
| 0.009788746696 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 6.089062459 × 10⁻⁷ | 1 | |
| 6.997383498 × 10⁻⁶ | 1 | |
| 7.858921587 × 10⁻⁶ | 1 | |
| 1.303431662 × 10⁻⁵ | 1 | |
| 2.214603532 × 10⁻⁵ | 1 | |
| 2.447602945 × 10⁻⁵ | 1 | |
| 2.987345433 × 10⁻⁵ | 1 | |
| 3.18633731 × 10⁻⁵ | 1 | |
| 4.070462222 × 10⁻⁵ | 1 | |
| 4.208724283 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999993036 | 1 | |
| 0.9999955035 | 1 | |
| 0.9999914694 | 1 | |
| 0.999988929 | 1 | |
| 0.9999880918 | 1 | |
| 0.9999788182 | 1 | |
| 0.9999704213 | 1 | |
| 0.9999562781 | 1 | |
| 0.9999518208 | 1 | |
| 0.999949995 | 1 |
microcosm
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.50008326 |
| Minimum | 2.8074476 × 10⁻⁶ |
|---|---|
| Maximum | 0.99999738 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 2.8074476 × 10⁻⁶ |
|---|---|
| 5-th percentile | 0.050159621 |
| Q1 | 0.24987698 |
| median | 0.50049801 |
| Q3 | 0.74876926 |
| 95-th percentile | 0.95042567 |
| Maximum | 0.99999738 |
| Range | 0.99999457 |
| Interquartile range (IQR) | 0.49889228 |
Descriptive statistics
| Standard deviation | 0.28849093 |
|---|---|
| Coefficient of variation (CV) | 0.5768858 |
| Kurtosis | -1.1968691 |
| Mean | 0.50008326 |
| Median Absolute Deviation (MAD) | 0.2494055 |
| Skewness | -0.00043232698 |
| Sum | 106194.68 |
| Variance | 0.083227017 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.7015243326 | 1 | < 0.1% |
| 0.7864218237 | 1 | < 0.1% |
| 0.4906556553 | 1 | < 0.1% |
| 0.9975424276 | 1 | < 0.1% |
| 0.9297736416 | 1 | < 0.1% |
| 0.05440083629 | 1 | < 0.1% |
| 0.3709832071 | 1 | < 0.1% |
| 0.8738324239 | 1 | < 0.1% |
| 0.4456450196 | 1 | < 0.1% |
| 0.3298512183 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 2.807447594 × 10⁻⁶ | 1 | |
| 3.632722 × 10⁻⁶ | 1 | |
| 8.553127314 × 10⁻⁶ | 1 | |
| 1.337573821 × 10⁻⁵ | 1 | |
| 1.567621956 × 10⁻⁵ | 1 | |
| 1.681754038 × 10⁻⁵ | 1 | |
| 2.162619995 × 10⁻⁵ | 1 | |
| 2.66833141 × 10⁻⁵ | 1 | |
| 3.497205066 × 10⁻⁵ | 1 | |
| 3.568488452 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999973799 | 1 | |
| 0.9999842374 | 1 | |
| 0.9999836936 | 1 | |
| 0.9999800567 | 1 | |
| 0.9999782066 | 1 | |
| 0.9999752968 | 1 | |
| 0.9999707887 | 1 | |
| 0.9999672694 | 1 | |
| 0.9999646969 | 1 | |
| 0.9999644794 | 1 |
miranda
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4998947 |
| Minimum | 1.1601668 × 10⁻⁶ |
|---|---|
| Maximum | 0.99999722 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1.1601668 × 10⁻⁶ |
|---|---|
| 5-th percentile | 0.049796691 |
| Q1 | 0.25020641 |
| median | 0.49963469 |
| Q3 | 0.74907498 |
| 95-th percentile | 0.94954477 |
| Maximum | 0.99999722 |
| Range | 0.99999606 |
| Interquartile range (IQR) | 0.49886857 |
Descriptive statistics
| Standard deviation | 0.28857114 |
|---|---|
| Coefficient of variation (CV) | 0.57726385 |
| Kurtosis | -1.1987745 |
| Mean | 0.4998947 |
| Median Absolute Deviation (MAD) | 0.24943107 |
| Skewness | -0.00047022689 |
| Sum | 106154.64 |
| Variance | 0.083273302 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.2509549596 | 1 | < 0.1% |
| 0.2048164354 | 1 | < 0.1% |
| 0.913230403 | 1 | < 0.1% |
| 0.6114164536 | 1 | < 0.1% |
| 0.5911479985 | 1 | < 0.1% |
| 0.5780028225 | 1 | < 0.1% |
| 0.1018984975 | 1 | < 0.1% |
| 0.3274011811 | 1 | < 0.1% |
| 0.2647108839 | 1 | < 0.1% |
| 0.4659180306 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 1.160166828 × 10⁻⁶ | 1 | |
| 2.177061828 × 10⁻⁶ | 1 | |
| 5.403161511 × 10⁻⁶ | 1 | |
| 9.935944034 × 10⁻⁶ | 1 | |
| 1.099262032 × 10⁻⁵ | 1 | |
| 1.768943477 × 10⁻⁵ | 1 | |
| 2.100137983 × 10⁻⁵ | 1 | |
| 2.453624145 × 10⁻⁵ | 1 | |
| 2.51735778 × 10⁻⁵ | 1 | |
| 2.678767066 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999972173 | 1 | |
| 0.9999962992 | 1 | |
| 0.9999951188 | 1 | |
| 0.9999932733 | 1 | |
| 0.9999887333 | 1 | |
| 0.9999772986 | 1 | |
| 0.9999726449 | 1 | |
| 0.9999704195 | 1 | |
| 0.9999560305 | 1 | |
| 0.9999533058 | 1 |
mirdb
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.50012512 |
| Minimum | 1.2888397 × 10⁻⁵ |
|---|---|
| Maximum | 0.99999969 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1.2888397 × 10⁻⁵ |
|---|---|
| 5-th percentile | 0.049837593 |
| Q1 | 0.25029266 |
| median | 0.49988755 |
| Q3 | 0.7499939 |
| 95-th percentile | 0.94979884 |
| Maximum | 0.99999969 |
| Range | 0.9999868 |
| Interquartile range (IQR) | 0.49970124 |
Descriptive statistics
| Standard deviation | 0.28873829 |
|---|---|
| Coefficient of variation (CV) | 0.57733212 |
| Kurtosis | -1.1992997 |
| Mean | 0.50012512 |
| Median Absolute Deviation (MAD) | 0.24987209 |
| Skewness | -0.00081665092 |
| Sum | 106203.57 |
| Variance | 0.083369803 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.1024580653 | 1 | < 0.1% |
| 0.5619198163 | 1 | < 0.1% |
| 0.1394716776 | 1 | < 0.1% |
| 0.4179716977 | 1 | < 0.1% |
| 0.9407641418 | 1 | < 0.1% |
| 0.8902321697 | 1 | < 0.1% |
| 0.7633443937 | 1 | < 0.1% |
| 0.1189429058 | 1 | < 0.1% |
| 0.7810754772 | 1 | < 0.1% |
| 0.9661873601 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 1.288839663 × 10⁻⁵ | 1 | |
| 1.735518527 × 10⁻⁵ | 1 | |
| 2.068202303 × 10⁻⁵ | 1 | |
| 2.509233614 × 10⁻⁵ | 1 | |
| 2.621049812 × 10⁻⁵ | 1 | |
| 2.824481228 × 10⁻⁵ | 1 | |
| 3.391776562 × 10⁻⁵ | 1 | |
| 4.578443329 × 10⁻⁵ | 1 | |
| 5.621847435 × 10⁻⁵ | 1 | |
| 6.030695304 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999996932 | 1 | |
| 0.9999984784 | 1 | |
| 0.9999962678 | 1 | |
| 0.9999867404 | 1 | |
| 0.9999852787 | 1 | |
| 0.999977196 | 1 | |
| 0.9999722823 | 1 | |
| 0.9999655619 | 1 | |
| 0.9999590561 | 1 | |
| 0.9999568651 | 1 |
pictar
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.49897453 |
| Minimum | 7.7439575 × 10⁻⁷ |
|---|---|
| Maximum | 0.99999862 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 7.7439575 × 10⁻⁷ |
|---|---|
| 5-th percentile | 0.049573943 |
| Q1 | 0.24979431 |
| median | 0.49852298 |
| Q3 | 0.74872122 |
| 95-th percentile | 0.9496059 |
| Maximum | 0.99999862 |
| Range | 0.99999785 |
| Interquartile range (IQR) | 0.49892691 |
Descriptive statistics
| Standard deviation | 0.28849949 |
|---|---|
| Coefficient of variation (CV) | 0.5781848 |
| Kurtosis | -1.1981275 |
| Mean | 0.49897453 |
| Median Absolute Deviation (MAD) | 0.24943054 |
| Skewness | 0.0046300914 |
| Sum | 105959.24 |
| Variance | 0.083231955 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.07702502958 | 1 | < 0.1% |
| 0.4381752163 | 1 | < 0.1% |
| 0.852955162 | 1 | < 0.1% |
| 0.2258819761 | 1 | < 0.1% |
| 0.6218615289 | 1 | < 0.1% |
| 0.7529753021 | 1 | < 0.1% |
| 0.8866593829 | 1 | < 0.1% |
| 0.6768941385 | 1 | < 0.1% |
| 0.3487345531 | 1 | < 0.1% |
| 0.8681888647 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 7.743957466 × 10⁻⁷ | 1 | |
| 2.537464936 × 10⁻⁶ | 1 | |
| 5.941009967 × 10⁻⁶ | 1 | |
| 7.404393862 × 10⁻⁶ | 1 | |
| 1.202365004 × 10⁻⁵ | 1 | |
| 1.287837196 × 10⁻⁵ | 1 | |
| 4.869594218 × 10⁻⁵ | 1 | |
| 5.776587314 × 10⁻⁵ | 1 | |
| 6.048936703 × 10⁻⁵ | 1 | |
| 7.553404638 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999986241 | 1 | |
| 0.9999898946 | 1 | |
| 0.9999775677 | 1 | |
| 0.9999711264 | 1 | |
| 0.9999437912 | 1 | |
| 0.9999419087 | 1 | |
| 0.9999386562 | 1 | |
| 0.9999369062 | 1 | |
| 0.9999287185 | 1 | |
| 0.9999262206 | 1 |
pita
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.50115233 |
| Minimum | 7.9775181 × 10⁻⁶ |
|---|---|
| Maximum | 0.99999347 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 7.9775181 × 10⁻⁶ |
|---|---|
| 5-th percentile | 0.050683607 |
| Q1 | 0.25032151 |
| median | 0.50222802 |
| Q3 | 0.75121219 |
| 95-th percentile | 0.95093408 |
| Maximum | 0.99999347 |
| Range | 0.9999855 |
| Interquartile range (IQR) | 0.50089068 |
Descriptive statistics
| Standard deviation | 0.2888743 |
|---|---|
| Coefficient of variation (CV) | 0.57642016 |
| Kurtosis | -1.2025394 |
| Mean | 0.50115233 |
| Median Absolute Deviation (MAD) | 0.25046699 |
| Skewness | -0.0026835509 |
| Sum | 106421.7 |
| Variance | 0.083448364 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.4637661991 | 1 | < 0.1% |
| 0.283602689 | 1 | < 0.1% |
| 0.3212623222 | 1 | < 0.1% |
| 0.2186850894 | 1 | < 0.1% |
| 0.4570374186 | 1 | < 0.1% |
| 0.955232418 | 1 | < 0.1% |
| 0.5025110953 | 1 | < 0.1% |
| 0.653453409 | 1 | < 0.1% |
| 0.3370602587 | 1 | < 0.1% |
| 0.2088513633 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 7.977518126 × 10⁻⁶ | 1 | |
| 8.180784783 × 10⁻⁶ | 1 | |
| 8.534000501 × 10⁻⁶ | 1 | |
| 1.69545237 × 10⁻⁵ | 1 | |
| 2.131069445 × 10⁻⁵ | 1 | |
| 2.258408668 × 10⁻⁵ | 1 | |
| 2.298550121 × 10⁻⁵ | 1 | |
| 2.636952295 × 10⁻⁵ | 1 | |
| 2.696162304 × 10⁻⁵ | 1 | |
| 3.438315388 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.9999934731 | 1 | |
| 0.9999829516 | 1 | |
| 0.9999820996 | 1 | |
| 0.9999776178 | 1 | |
| 0.999975547 | 1 | |
| 0.999973749 | 1 | |
| 0.9999733255 | 1 | |
| 0.9999714195 | 1 | |
| 0.9999693917 | 1 | |
| 0.9999668731 | 1 |
targetscan
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.501117 |
| Minimum | 4.1364528 × 10⁻⁶ |
|---|---|
| Maximum | 0.99999668 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 4.1364528 × 10⁻⁶ |
|---|---|
| 5-th percentile | 0.050122576 |
| Q1 | 0.25178984 |
| median | 0.50181385 |
| Q3 | 0.75186016 |
| 95-th percentile | 0.95036286 |
| Maximum | 0.99999668 |
| Range | 0.99999254 |
| Interquartile range (IQR) | 0.50007032 |
Descriptive statistics
| Standard deviation | 0.28867799 |
|---|---|
| Coefficient of variation (CV) | 0.57606903 |
| Kurtosis | -1.2000675 |
| Mean | 0.501117 |
| Median Absolute Deviation (MAD) | 0.25003186 |
| Skewness | -0.0050861131 |
| Sum | 106414.2 |
| Variance | 0.083334981 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.7651636218 | 1 | < 0.1% |
| 0.9282443263 | 1 | < 0.1% |
| 0.1561413195 | 1 | < 0.1% |
| 0.9269707731 | 1 | < 0.1% |
| 0.9061510761 | 1 | < 0.1% |
| 0.04937826627 | 1 | < 0.1% |
| 0.1097590095 | 1 | < 0.1% |
| 0.2483067053 | 1 | < 0.1% |
| 0.2265969148 | 1 | < 0.1% |
| 0.9442487152 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 4.136452795 × 10⁻⁶ | 1 | |
| 4.335402382 × 10⁻⁶ | 1 | |
| 4.461687015 × 10⁻⁶ | 1 | |
| 6.533224721 × 10⁻⁶ | 1 | |
| 1.218648272 × 10⁻⁵ | 1 | |
| 1.407818904 × 10⁻⁵ | 1 | |
| 1.808029239 × 10⁻⁵ | 1 | |
| 2.329879473 × 10⁻⁵ | 1 | |
| 2.517149427 × 10⁻⁵ | 1 | |
| 2.690187606 × 10⁻⁵ | 1 |
| Value | Count | Frequency (%) |
| 0.999996679 | 1 | |
| 0.9999911958 | 1 | |
| 0.9999892316 | 1 | |
| 0.9999883625 | 1 | |
| 0.9999881889 | 1 | |
| 0.9999857743 | 1 | |
| 0.9999807865 | 1 | |
| 0.9999792912 | 1 | |
| 0.999972055 | 1 | |
| 0.9999714264 | 1 |
predicted.sum
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.0103462 |
| Minimum | 1.5824651 × 10⁻⁵ |
|---|---|
| Maximum | 9.9999084 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1.5824651 × 10⁻⁵ |
|---|---|
| 5-th percentile | 0.50566384 |
| Q1 | 2.5122413 |
| median | 5.0181956 |
| Q3 | 7.5088777 |
| 95-th percentile | 9.5032151 |
| Maximum | 9.9999084 |
| Range | 9.9998925 |
| Interquartile range (IQR) | 4.9966365 |
Descriptive statistics
| Standard deviation | 2.8844607 |
|---|---|
| Coefficient of variation (CV) | 0.57570087 |
| Kurtosis | -1.197592 |
| Mean | 5.0103462 |
| Median Absolute Deviation (MAD) | 2.4983252 |
| Skewness | -0.0047710748 |
| Sum | 1063967.1 |
| Variance | 8.3201134 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2.392973721 | 1 | < 0.1% |
| 4.324298833 | 1 | < 0.1% |
| 6.243358881 | 1 | < 0.1% |
| 2.650631148 | 1 | < 0.1% |
| 5.586710605 | 1 | < 0.1% |
| 2.906425036 | 1 | < 0.1% |
| 9.385978417 | 1 | < 0.1% |
| 2.275804283 | 1 | < 0.1% |
| 5.400942902 | 1 | < 0.1% |
| 3.256649819 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 1.582465119 × 10⁻⁵ | 1 | |
| 0.0001127681052 | 1 | |
| 0.0001258443752 | 1 | |
| 0.0001910987888 | 1 | |
| 0.0001976880087 | 1 | |
| 0.000262373681 | 1 | |
| 0.0003145903409 | 1 | |
| 0.0003205559806 | 1 | |
| 0.0003493989846 | 1 | |
| 0.0003699696828 | 1 |
| Value | Count | Frequency (%) |
| 9.999908374 | 1 | |
| 9.999802292 | 1 | |
| 9.999761352 | 1 | |
| 9.999679984 | 1 | |
| 9.999619268 | 1 | |
| 9.999538163 | 1 | |
| 9.999525225 | 1 | |
| 9.999501612 | 1 | |
| 9.999480194 | 1 | |
| 9.999463328 | 1 |
all.sum
Real number (ℝ)
Unique 
| Distinct | 212354 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.0000851 |
| Minimum | 8.1482043 × 10⁻⁵ |
|---|---|
| Maximum | 9.9999872 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 8.1482043 × 10⁻⁵ |
|---|---|
| 5-th percentile | 0.50052138 |
| Q1 | 2.5059411 |
| median | 5.0035977 |
| Q3 | 7.4931057 |
| 95-th percentile | 9.4958729 |
| Maximum | 9.9999872 |
| Range | 9.9999057 |
| Interquartile range (IQR) | 4.9871646 |
Descriptive statistics
| Standard deviation | 2.8826847 |
|---|---|
| Coefficient of variation (CV) | 0.57652712 |
| Kurtosis | -1.1970479 |
| Mean | 5.0000851 |
| Median Absolute Deviation (MAD) | 2.4933649 |
| Skewness | -0.0023108606 |
| Sum | 1061788.1 |
| Variance | 8.309871 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.487877859 | 1 | < 0.1% |
| 7.666791428 | 1 | < 0.1% |
| 4.13362407 | 1 | < 0.1% |
| 3.070646043 | 1 | < 0.1% |
| 2.389189758 | 1 | < 0.1% |
| 3.035776214 | 1 | < 0.1% |
| 6.997966007 | 1 | < 0.1% |
| 9.420516518 | 1 | < 0.1% |
| 3.073373663 | 1 | < 0.1% |
| 1.482559286 | 1 | < 0.1% |
| Other values (212344) | 212344 |
| Value | Count | Frequency (%) |
| 8.148204313 × 10⁻⁵ | 1 | |
| 0.0001859963496 | 1 | |
| 0.0003688354131 | 1 | |
| 0.0003699593097 | 1 | |
| 0.000414254698 | 1 | |
| 0.0004323137693 | 1 | |
| 0.0004683989212 | 1 | |
| 0.0004721965414 | 1 | |
| 0.0005789316115 | 1 | |
| 0.0005834709547 | 1 |
| Value | Count | Frequency (%) |
| 9.999987182 | 1 | |
| 9.999951269 | 1 | |
| 9.999939693 | 1 | |
| 9.999878253 | 1 | |
| 9.999845089 | 1 | |
| 9.999826876 | 1 | |
| 9.999796655 | 1 | |
| 9.999788935 | 1 | |
| 9.999768537 | 1 | |
| 9.999766095 | 1 |
label
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 191395 | 90.1% |
| 1 | 20959 | 9.9% |
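The counts above can be related to the imbalance percentage that the report's alerts give for label (53.5%). A minimal sketch, assuming the score is defined as one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy; the function name `imbalance_score` is illustrative and not taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    Fewer than two classes returns 0.0 by convention (an assumption of this sketch).
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)

# Class counts for label: 0 -> 191395, 1 -> 20959.
label_counts = [191395, 20959]
score = imbalance_score(label_counts)
print(f"{score:.1%}")
```

Under this definition a 50/50 split scores 0, so a value above 0.5 reflects the roughly 9:1 ratio between the two classes.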
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 191395 | 90.1% |
| 1 | 20959 | 9.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 212354 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 191395 | 90.1% |
| 1 | 20959 | 9.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 212354 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 191395 | 90.1% |
| 1 | 20959 | 9.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 212354 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 191395 | 90.1% |
| 1 | 20959 | 9.9% |
Interactions
Correlations
| | age | alcohol_consumption | all.sum | biopsy_results | ct_scan | diana_microt | dietary_habits | elmmo | endoscopic_images | ethnicity | existing_conditions | family_history | gender | geographical_location | helicobacter_pylori_infection | label | mature_mirna_acc | mature_mirna_id | microcosm | miranda | mirdb | pictar | pita | predicted.sum | smoking_habits | target_ensembl | target_entrez | target_symbol | targetscan |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| age | 1.000 | 0.000 | -0.001 | 0.003 | 0.006 | 0.003 | 0.004 | -0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.002 | 0.004 | 0.000 | 0.000 | 0.002 | 0.001 | 0.003 | -0.000 | 0.000 | -0.002 | 0.001 | 0.000 | 0.001 | -0.000 | 0.004 | -0.000 |
| alcohol_consumption | 0.000 | 1.000 | 0.000 | 0.003 | 0.002 | 0.000 | 0.001 | 0.004 | 0.001 | 0.000 | 0.003 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.005 | 0.000 | 0.004 | 0.004 | 0.000 | 0.003 | 0.002 | 0.003 |
| all.sum | -0.001 | 0.000 | 1.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.002 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.004 | 0.006 | 0.005 | 0.000 | 0.006 | 0.000 | 0.001 | 0.001 | 0.003 | -0.001 | -0.002 | 0.000 | -0.001 | 0.002 | 0.000 | 0.001 |
| biopsy_results | 0.003 | 0.003 | 0.000 | 1.000 | 0.000 | 0.007 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.005 | 0.004 | 0.000 | 0.000 |
| ct_scan | 0.006 | 0.002 | 0.001 | 0.000 | 1.000 | 0.000 | 0.000 | 0.003 | 0.002 | 0.004 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.001 | 0.004 | 0.002 | 0.007 | 0.003 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 |
| diana_microt | 0.003 | 0.000 | 0.001 | 0.007 | 0.000 | 1.000 | 0.007 | 0.003 | 0.000 | 0.007 | 0.000 | 0.003 | 0.000 | 0.005 | 0.000 | 0.001 | 0.000 | 0.000 | -0.003 | -0.003 | -0.001 | -0.001 | 0.000 | -0.001 | 0.003 | 0.000 | 0.002 | 0.000 | -0.002 |
| dietary_habits | 0.004 | 0.001 | 0.000 | 0.000 | 0.000 | 0.007 | 1.000 | 0.006 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.005 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 |
| elmmo | -0.001 | 0.004 | 0.002 | 0.003 | 0.003 | 0.003 | 0.006 | 1.000 | 0.005 | 0.000 | 0.006 | 0.000 | 0.005 | 0.002 | 0.005 | 0.002 | 0.000 | 0.000 | -0.002 | -0.001 | -0.001 | -0.000 | 0.002 | -0.003 | 0.005 | 0.002 | 0.000 | 0.004 | 0.001 |
| endoscopic_images | 0.000 | 0.001 | 0.000 | 0.000 | 0.002 | 0.000 | 0.003 | 0.005 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.002 | 0.002 | 0.006 | 0.005 | 0.002 | 0.007 | 0.000 | 0.002 | 0.000 | 0.005 | 0.000 | 0.004 | 0.002 |
| ethnicity | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.007 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.004 | 0.005 | 0.004 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.006 | 0.002 | 0.005 | 0.002 | 0.000 | 0.006 | 0.001 | 0.001 | 0.001 |
| existing_conditions | 0.000 | 0.003 | 0.003 | 0.000 | 0.002 | 0.000 | 0.000 | 0.006 | 0.000 | 0.000 | 1.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.002 | 0.003 | 0.000 | 0.000 | 0.000 | 0.002 | 0.005 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 |
| family_history | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 1.000 | 0.003 | 0.000 | 0.000 | 0.003 | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.003 | 0.000 | 0.000 | 0.000 | 0.002 |
| gender | 0.004 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.005 | 0.000 | 0.005 | 0.001 | 0.003 | 1.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.003 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 |
| geographical_location | 0.002 | 0.004 | 0.004 | 0.000 | 0.000 | 0.005 | 0.000 | 0.002 | 0.001 | 0.004 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.005 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.002 |
| helicobacter_pylori_infection | 0.004 | 0.000 | 0.006 | 0.000 | 0.000 | 0.000 | 0.004 | 0.005 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 1.000 | 0.001 | 0.000 | 0.003 | 0.000 | 0.004 | 0.000 | 0.000 | 0.003 | 0.004 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 |
| label | 0.000 | 0.000 | 0.005 | 0.000 | 0.000 | 0.001 | 0.000 | 0.002 | 0.000 | 0.000 | 0.002 | 0.003 | 0.000 | 0.000 | 0.001 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.003 | 0.000 | 0.005 |
| mature_mirna_acc | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.003 | 0.004 | 0.000 | 0.004 | 0.000 | 0.000 | 1.000 | 0.002 | 0.004 | 0.007 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.003 | 0.003 |
| mature_mirna_id | 0.002 | 0.000 | 0.006 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.002 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.002 | 1.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 |
| microcosm | 0.001 | 0.000 | 0.000 | 0.000 | 0.001 | -0.003 | 0.000 | -0.002 | 0.006 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.001 | 1.000 | -0.002 | 0.002 | -0.000 | -0.001 | -0.001 | 0.001 | -0.000 | -0.001 | 0.003 | 0.002 |
| miranda | 0.003 | 0.003 | 0.001 | 0.000 | 0.004 | -0.003 | 0.005 | -0.001 | 0.005 | 0.000 | 0.000 | 0.000 | 0.002 | 0.005 | 0.004 | 0.000 | 0.007 | 0.000 | -0.002 | 1.000 | -0.001 | 0.001 | -0.001 | -0.000 | 0.000 | 0.003 | -0.004 | 0.004 | 0.002 |
| mirdb | -0.000 | 0.000 | 0.001 | 0.000 | 0.002 | -0.001 | 0.000 | -0.001 | 0.002 | 0.006 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.002 | -0.001 | 1.000 | 0.001 | -0.001 | -0.005 | 0.000 | 0.004 | 0.002 | 0.005 | -0.000 |
| pictar | 0.000 | 0.005 | 0.003 | 0.000 | 0.007 | -0.001 | 0.002 | -0.000 | 0.007 | 0.002 | 0.005 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.000 | 0.001 | 0.001 | 1.000 | 0.002 | -0.001 | 0.000 | -0.001 | -0.002 | 0.000 | 0.002 |
| pita | -0.002 | 0.000 | -0.001 | 0.000 | 0.003 | 0.000 | 0.000 | 0.002 | 0.000 | 0.005 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.003 | -0.001 | -0.001 | -0.001 | 0.002 | 1.000 | -0.000 | 0.000 | 0.003 | -0.002 | 0.000 | -0.003 |
| predicted.sum | 0.001 | 0.004 | -0.002 | 0.000 | 0.004 | -0.001 | 0.000 | -0.003 | 0.002 | 0.002 | 0.000 | 0.004 | 0.004 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | -0.001 | -0.000 | -0.005 | -0.001 | -0.000 | 1.000 | 0.001 | 0.001 | -0.002 | 0.003 | 0.000 |
| smoking_habits | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.005 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 1.000 | 0.006 | 0.000 | 0.000 | 0.000 |
| target_ensembl | 0.001 | 0.000 | -0.001 | 0.005 | 0.000 | 0.000 | 0.000 | 0.002 | 0.005 | 0.006 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.000 | 0.003 | 0.004 | -0.001 | 0.003 | 0.001 | 0.006 | 1.000 | 0.002 | 0.000 | 0.001 |
| target_entrez | -0.000 | 0.003 | 0.002 | 0.004 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.004 | 0.003 | 0.003 | 0.004 | 0.000 | -0.001 | -0.004 | 0.002 | -0.002 | -0.002 | -0.002 | 0.000 | 0.002 | 1.000 | 0.004 | -0.001 |
| target_symbol | 0.004 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.004 | 0.004 | 0.004 | 0.001 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.003 | 0.004 | 0.005 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.004 | 1.000 | 0.000 |
| targetscan | -0.000 | 0.003 | 0.001 | 0.000 | 0.004 | -0.002 | 0.000 | 0.001 | 0.002 | 0.001 | 0.000 | 0.002 | 0.000 | 0.002 | 0.000 | 0.005 | 0.003 | 0.003 | 0.002 | 0.002 | -0.000 | 0.002 | -0.003 | 0.000 | 0.000 | 0.001 | -0.001 | 0.000 | 1.000 |
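One way to read a matrix of this size is to list the off-diagonal pairs by absolute coefficient and flag those above a chosen cutoff. The sketch below does this for a four-variable excerpt copied from the table above; the helper name `off_diagonal_pairs` and the 0.5 cutoff are illustrative assumptions, not part of the report.

```python
def off_diagonal_pairs(names, matrix):
    """Yield (name_a, name_b, coefficient) for every pair above the diagonal."""
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            yield names[i], names[j], matrix[i][j]

# Excerpt of the correlation table: rows and columns in the same order.
names = ["age", "diana_microt", "elmmo", "label"]
excerpt = [
    [1.000, 0.003, -0.001, 0.000],
    [0.003, 1.000, 0.003, 0.001],
    [-0.001, 0.003, 1.000, 0.002],
    [0.000, 0.001, 0.002, 1.000],
]

# Sort pairs from strongest to weakest absolute correlation.
pairs = sorted(off_diagonal_pairs(names, excerpt), key=lambda p: abs(p[2]), reverse=True)
strongest = pairs[0]

# Keep only pairs at or above the (arbitrary) cutoff.
cutoff = 0.5
flagged = [p for p in pairs if abs(p[2]) >= cutoff]

print("strongest pair:", strongest)
print("pairs at or above cutoff:", flagged)
```

For this excerpt no pair comes close to the cutoff, in line with the near-zero coefficients throughout the table.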
Missing values
Sample
First rows
| | age | gender | ethnicity | geographical_location | family_history | smoking_habits | alcohol_consumption | helicobacter_pylori_infection | dietary_habits | existing_conditions | endoscopic_images | biopsy_results | ct_scan | mature_mirna_acc | mature_mirna_id | target_symbol | target_entrez | target_ensembl | diana_microt | elmmo | microcosm | miranda | mirdb | pictar | pita | targetscan | predicted.sum | all.sum | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 43 | Male | Ethnicity_A | Other | 1 | 0 | 0 | 0 | Low_Salt | Chronic Gastritis | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 5034 | 1946626 | 0.381326 | 0.187003 | 0.786422 | 0.204816 | 0.561920 | 0.438175 | 0.283603 | 0.928244 | 4.324299 | 7.666791 | 0 |
| 1 | 86 | Female | Ethnicity_B | California | 1 | 0 | 0 | 1 | High_Salt | Diabetes | Normal | Negative | Negative | MIR123 | MIR234_2 | TP53 | 6901 | 1503178 | 0.958561 | 0.493322 | 0.963989 | 0.498041 | 0.985585 | 0.144609 | 0.375375 | 0.103573 | 7.967674 | 1.483280 | 0 |
| 2 | 68 | Male | Ethnicity_A | California | 0 | 1 | 1 | 0 | High_Salt | NaN | Normal | Negative | Negative | MIR345 | MIR345_3 | KRAS | 4014 | 1787909 | 0.269537 | 0.573560 | 0.666896 | 0.540388 | 0.905853 | 0.827279 | 0.350915 | 0.166878 | 3.748651 | 3.046783 | 0 |
| 3 | 57 | Female | Ethnicity_A | Other | 0 | 0 | 0 | 1 | High_Salt | Chronic Gastritis | Normal | Negative | Negative | MIR123 | MIR123_1 | KRAS | 7351 | 1310766 | 0.372287 | 0.261399 | 0.949488 | 0.134170 | 0.429935 | 0.935231 | 0.794704 | 0.867036 | 5.478298 | 8.811307 | 0 |
| 4 | 33 | Male | Ethnicity_A | California | 0 | 1 | 1 | 0 | High_Salt | Diabetes | Abnormal | Negative | Negative | MIR345 | MIR123_1 | CDH1 | 7982 | 1277058 | 0.974656 | 0.754478 | 0.263164 | 0.876767 | 0.650832 | 0.337669 | 0.427492 | 0.915804 | 1.809181 | 0.394632 | 0 |
| 5 | 33 | Female | Ethnicity_C | California | 1 | 0 | 1 | 0 | High_Salt | Chronic Gastritis | Abnormal | Negative | Negative | MIR123 | MIR234_2 | TP53 | 4858 | 1665985 | 0.906934 | 0.050714 | 0.954498 | 0.928561 | 0.349869 | 0.768395 | 0.357806 | 0.464772 | 4.470495 | 8.478575 | 0 |
| 6 | 26 | Male | Ethnicity_B | California | 0 | 1 | 0 | 0 | High_Salt | Chronic Gastritis | Abnormal | Negative | Negative | MIR123 | MIR123_1 | KRAS | 6192 | 1349266 | 0.387300 | 0.159288 | 0.973591 | 0.755524 | 0.193466 | 0.902805 | 0.117261 | 0.884740 | 4.022234 | 9.744591 | 0 |
| 7 | 79 | Male | Ethnicity_B | California | 1 | 0 | 0 | 0 | High_Salt | Diabetes | Normal | Negative | Positive | MIR123 | MIR123_1 | CDH1 | 2249 | 1070986 | 0.067839 | 0.895501 | 0.483289 | 0.047487 | 0.747612 | 0.020567 | 0.376179 | 0.236240 | 9.220684 | 1.084167 | 0 |
| 8 | 58 | Male | Ethnicity_A | California | 1 | 0 | 0 | 1 | High_Salt | NaN | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 5305 | 1782837 | 0.918927 | 0.934111 | 0.389257 | 0.690765 | 0.228321 | 0.348682 | 0.180423 | 0.161433 | 5.490160 | 9.528053 | 0 |
| 9 | 66 | Female | Ethnicity_B | California | 0 | 1 | 1 | 1 | High_Salt | Chronic Gastritis | Normal | Negative | Positive | MIR123 | MIR123_1 | TP53 | 8828 | 1895030 | 0.464553 | 0.467400 | 0.992538 | 0.610410 | 0.204009 | 0.339347 | 0.790507 | 0.923155 | 5.312960 | 6.509198 | 1 |
Last rows
| | age | gender | ethnicity | geographical_location | family_history | smoking_habits | alcohol_consumption | helicobacter_pylori_infection | dietary_habits | existing_conditions | endoscopic_images | biopsy_results | ct_scan | mature_mirna_acc | mature_mirna_id | target_symbol | target_entrez | target_ensembl | diana_microt | elmmo | microcosm | miranda | mirdb | pictar | pita | targetscan | predicted.sum | all.sum | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 212344 | 70 | Female | Ethnicity_A | California | 1 | 0 | 1 | 1 | High_Salt | Chronic Gastritis | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 4617 | 1121004 | 0.214053 | 0.985299 | 0.030505 | 0.476153 | 0.749060 | 0.922630 | 0.231277 | 0.891736 | 8.581466 | 2.303212 | 0 |
| 212345 | 40 | Male | Ethnicity_A | California | 0 | 0 | 0 | 0 | High_Salt | Diabetes | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 7311 | 1856009 | 0.854685 | 0.389023 | 0.579350 | 0.831756 | 0.987200 | 0.436073 | 0.096688 | 0.724573 | 2.914234 | 1.912806 | 1 |
| 212346 | 32 | Male | Ethnicity_B | California | 0 | 1 | 0 | 0 | High_Salt | Diabetes | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 7455 | 1809520 | 0.040721 | 0.707163 | 0.065486 | 0.345393 | 0.485852 | 0.779421 | 0.095927 | 0.158654 | 5.793998 | 8.948175 | 0 |
| 212347 | 27 | Male | Ethnicity_A | Other | 0 | 0 | 1 | 1 | High_Salt | Chronic Gastritis | Abnormal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 6579 | 1398335 | 0.847733 | 0.450773 | 0.545828 | 0.739607 | 0.929764 | 0.131652 | 0.880751 | 0.134566 | 3.610328 | 8.253089 | 0 |
| 212348 | 49 | Female | Ethnicity_C | California | 0 | 1 | 1 | 1 | High_Salt | Chronic Gastritis | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 5046 | 1582047 | 0.760967 | 0.713226 | 0.432300 | 0.507213 | 0.689894 | 0.998506 | 0.569885 | 0.978047 | 2.795402 | 3.935588 | 0 |
| 212349 | 47 | Male | Ethnicity_A | California | 0 | 1 | 1 | 0 | High_Salt | Chronic Gastritis | Abnormal | Negative | Negative | MIR123 | MIR123_1 | KRAS | 6934 | 1980281 | 0.972193 | 0.408123 | 0.925035 | 0.267628 | 0.538474 | 0.463471 | 0.827892 | 0.929147 | 0.866260 | 6.698303 | 0 |
| 212350 | 59 | Female | Ethnicity_A | California | 0 | 1 | 1 | 0 | High_Salt | Chronic Gastritis | Normal | Negative | Negative | MIR123 | MIR123_1 | TP53 | 6581 | 1287697 | 0.988244 | 0.578204 | 0.768279 | 0.339168 | 0.236523 | 0.444216 | 0.011762 | 0.053098 | 5.102216 | 2.338017 | 0 |
| 212351 | 58 | Female | Ethnicity_C | California | 0 | 0 | 0 | 0 | High_Salt | NaN | Abnormal | Negative | Negative | MIR123 | MIR123_1 | CDH1 | 9922 | 1115299 | 0.952403 | 0.940936 | 0.865221 | 0.448189 | 0.899351 | 0.766615 | 0.098454 | 0.384019 | 7.774612 | 4.601637 | 1 |
| 212352 | 77 | Female | Ethnicity_B | California | 1 | 1 | 0 | 1 | High_Salt | Chronic Gastritis | Abnormal | Negative | Negative | MIR123 | MIR123_1 | KRAS | 7558 | 1415023 | 0.924784 | 0.498088 | 0.048195 | 0.924610 | 0.101521 | 0.464988 | 0.639401 | 0.410327 | 2.362698 | 2.020047 | 0 |
| 212353 | 71 | Male | Ethnicity_A | California | 0 | 0 | 0 | 1 | High_Salt | NaN | Normal | Positive | Negative | MIR123 | MIR345_3 | KRAS | 3420 | 1547142 | 0.146955 | 0.689696 | 0.701524 | 0.250955 | 0.102458 | 0.077025 | 0.463766 | 0.765164 | 2.392974 | 4.487878 | 0 |